-
Notifications
You must be signed in to change notification settings - Fork 6.8k
[MXNET-862] Basic maven jenkins pipeline #13450
Conversation
@mxnet-label-bot add [Java, Scala, CI, Maven] |
Lets wait until after the release before we move forward with this PR as discussed on dev@ :) |
ci/Jenkinsfile_utils.groovy
Outdated
if(clean == 1) | ||
clean = 'git clean -xdf' | ||
else if(clean == 2) | ||
clean = 'git clean -xdff' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a difference between the clean
command in clean == 1 and clean == 2 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The clean -ff will remove code inside the .git directory which messes with the deploy process. See https://git-scm.com/docs/git-clean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please leave this as it is.
The workspace always has to be considered empty and reproducible. If you need to persist something in between stages, use the stashing mechanism. With this change you are circumventing a problem - that you are not persisting state properly - and it will blow up as soon as we have more than one restricted slave and job one gets scheduled onto a different slave than job two
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is not to persist something in between stages. During maven deploy, it attempts to move all code into a temp directory, rebuild, and deploy from that temp directory. The rebuilding calls "git submodule update" which breaks if the .git directory has already been cleaned with "-ff" since the submodule configuration is deleted. I keep the default as "clean -xdff" so that keeping the .git directory must be explicitly chosen and does not affect any other code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd like to have a deeper look into that, so far I'm not following
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The maven release plugin messes with the project SCM (git). It first tries to create a new tag for the release and a new commit updating the version number located in the POM file. Then, when performing the release, it will checkout the code to a temporary directory and perform the release (building, testing, and deploying) from that new clean directory. This will break if the .git directory has been pruned. You can see docs for the maven release plugin at http://maven.apache.org/maven-release/maven-release-plugin/examples/prepare-release.html
scala-package/examples/pom.xml
Outdated
@@ -119,6 +119,14 @@ | |||
<plugin> | |||
<groupId>org.apache.maven.plugins</groupId> | |||
<artifactId>maven-jar-plugin</artifactId> | |||
<executions> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would the child poms not inherit this execution stage from the parent pom file ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the maven-jar-plugin is in the parent pom file because we don't want all of the modules to inherit it
scala-package/packageTest/README.md
Outdated
@@ -0,0 +1,50 @@ | |||
# MXNet Scala Package Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This README is super helpful!
Thanks @zachgk 💯
scala-package/packageTest/Makefile
Outdated
|
||
SCALA_VERSION_PROFILE := 2.11 | ||
SCALA_VERSION := 2.11.8 | ||
MXNET_VERSION := 1.3.1-SNAPSHOT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this version remain hardcoded here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will investigate using [1.3.1-SNAPSHOT,) to see if it works. Otherwise, I don't know if there are other options.
Makefile
Outdated
-Dbuild.platform="$(SCALA_PKG_PROFILE)" \ | ||
-Dcflags="$(CFLAGS)" -Dldflags="$(LDFLAGS)" \ | ||
-Dlddeps="$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a") | ||
|
||
scaladeploylocal: | ||
(cd $(ROOTDIR)/scala-package; \ | ||
mvn deploy $(MAVEN_ARGS) -Papache-release,$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) \-DskipTests=true -Dcxx="$(CXX)" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why you are skipping the tests here on the deploy stage ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are skipping those tests because we run the tests separately with scalaunittest/scalaintegrationtest before calling deploy and there is no reason to repeat tests
@@ -454,6 +454,10 @@ else | |||
CFLAGS += -DMXNET_USE_LIBJPEG_TURBO=0 | |||
endif | |||
|
|||
ifeq ($(CI), 1) | |||
MAVEN_ARGS := -B |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious to know what this -B
flag does :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The -B is batch mode. The main change is that it removes the downloading progress messages that are written to the terminal/logs. They take up a lot of space and make the logs hard to read.
4a1a078
to
88cd89e
Compare
Progress
@zachgk Can you rebase this PR? |
88cd89e
to
eaf3229
Compare
I am currently thinking that since these are relatively large commits, I would split them into three different PRs that will create respectively a basic pipeline, a pipeline with multi-environment testing, and the final pipeline integrating the static linking code. After rebasing, I reduced the amount of code as part of this PR (it has some of commit 2), so it should be roughly equal to the code from 20 days ago. This also no longer has code from the packageTest utility because that was merged into master. @mxnet-label-bot update [CI, Java, Maven, Scala, pr-awaiting-review] |
c03e2e2
to
f317589
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution, please indicate the change you would like to make for your second PR
As part of the later PRs, the changes to the current files will include:
Note that this PR forms a working pipeline that will build, test, and then deploy the package for ubuntu 16.04 Please review @nswamy @andrewfayres @piyushghai @marcoabreu |
utils.assign_node_labels(utility: 'restricted-utility', linux_cpu: 'restricted-mxnetlinux-cpu', linux_gpu: 'restricted-mxnetlinux-gpu', linux_gpu_p3: 'restricted-mxnetlinux-gpu-p3', windows_cpu: 'restricted-mxnetwindows-cpu', windows_gpu: 'restricted-mxnetwindows-gpu') | ||
|
||
def nodeMap = ['cpu': NODE_LINUX_CPU, 'gpu': NODE_LINUX_GPU] | ||
def scalaOSMap = ['cpu': 'linux-x86_64-cpu', 'gpu': 'linux-x86_64-gpu'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment here as to why we are not doing MacOS through Jenkins ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
f317589
to
0c19e1f
Compare
ci/Jenkinsfile_utils.groovy
Outdated
cleanCmd = 'git clean -xdf' | ||
else if(clean == 2) | ||
cleanCmd = 'git clean -xdff' | ||
sh cleanCmd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate why this is needed? This "super brutal clean" as I like to call it, was requested by @lebeg and he said it's important. Maybe you two could have a discussion there :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I can remove this because we are no longer going to use the maven-release-plugin following updates as part of #13626
def nodeMap = ['cpu': NODE_LINUX_CPU, 'gpu': NODE_LINUX_GPU] | ||
def scalaOSMap = ['cpu': 'linux-x86_64-cpu', 'gpu': 'linux-x86_64-gpu'] | ||
|
||
def wrapStep(nodeToRun, workspaceName, step) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this generator model. Great job!
ci/publish/Jenkinsfile
Outdated
return { | ||
node(nodeToRun) { | ||
ws("workspace/${workspaceName}") { | ||
env.MAVEN_OPTS = "-Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn" // Clean up maven logs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to move this logic into the runtime functions? We generally try to keep Jenkinsfile limited to process flow design while logic is in the runtime functions. Otherwise, you can't reproduce the individual steps using the docker wrapper because certain variables (like this env var) are set outside of the docker process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I'll move it
scala-package/dev/buildkey.py
Outdated
logging.exception("The request was invalid due to:") | ||
elif client_error.response['Error']['Code'] == 'InvalidParameterException': | ||
logging.exception("The request had invalid params:") | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is the default template of AWS SecretsManager, but could we raise always (instead of the else case) to halt the execution? I have made the experience that letting the process continue to then fail later obfuscated the error message and confuses some people.
In this particular case, the function would return None while the caller expects a tuple. This then throws a confusion message which makes it harder to debug. wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this sounds like a good improvement. I'll apply it in docker_cache as well for the sake of consistency
def importASC(key, gpgPassphrase): | ||
filename = os.path.join(KEY_PATH, "key.asc") | ||
with open(filename, 'w') as f: | ||
f.write(key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we making sure that this file is deleted when the process finishes? Or is it local to the container?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, same applies to the hidden settings.xml
in .m2
folder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is put in the ~ directory, which I think that makes it local to the container. Either way, I will add some rm statements for extra security.
scala-package/dev/buildkey.py
Outdated
f.write(key) | ||
subprocess.run(['gpg2', '--batch', '--yes', | ||
'--passphrase=\"{}\"'.format(gpgPassphrase), | ||
"--import", "{}".format(filename)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha! Pro tip - I currently have to write a COE because I made the same mistake - don't do that or you will leak credentials :P
Subprocess.run will print the command (in this case including the password) if the execution fails. This then writes the password to the log and you will leak credentials. Is it possible to pass in the password using STDIN or some other redirection method?
I can link you the ticket privately if you're interested in further details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcoabreu Currently, we use fake gpg key in here as it is not necessary to have the real one in the deploy stage. Problems will appear in the release stage as I would not do it with CI. But still it is very useful information, please keep us updated if you find a better solution to this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lanking520 We still use this to encrypt the passwords through maven, so this is certainly a problem we need to address.
I worked around this using a combination of passing data through stdin and passing data using the linux expect program (because maven doesn't work with stdin)
scala-package/dev/buildkey.py
Outdated
.format(password)) | ||
|
||
|
||
def severPSW(username, password, gpgPassphrase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does "sever" mean? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Apache server
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lanking520 "server" was misspelled
0c19e1f
to
80e69fc
Compare
@marcoabreu Could you please review again? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for these changes. Looks good to me :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still looks good. Tested locally with the build steps and all working!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nits, but overall looks great :)
scala-package/dev/build.sh
Outdated
|
||
# Scala steps to deploy | ||
make scalapkg CI=1 | ||
make scalaunittest CI=1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The filename lets me assume the task of this script is to build, but it seems like a lot of testing is happening as well.
Generally, we try to split compilation and testing first to make it cleaner, and second to use the most appropriate instance type. Since you also made a test script, could we move these tests over there to keep the compilation jobs as lightweight as possible?
scala-package/dev/deploy.sh
Outdated
|
||
if [[ $MAVEN_PUBLISH_OS_TYPE == "linux-x86_64-cpu" ]]; | ||
then | ||
MAKE_FLAGS="USE_BLAS=openblas USE_DIST_KVSTORE=1 ENABLE_TESTCOVERAGE=1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to compile with test coverage annotations for production builds?
scala-package/dev/deploy.sh
Outdated
MAKE_FLAGS="USE_BLAS=openblas USE_DIST_KVSTORE=1 ENABLE_TESTCOVERAGE=1" | ||
elif [[ $MAVEN_PUBLISH_OS_TYPE == "linux-x86_64-gpu" ]] | ||
then | ||
MAKE_FLAGS="USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1 SCALA_ON_GPU=1 SCALA_TEST_ON_GPU=1 USE_DIST_KVSTORE=1 ENABLE_TESTCOVERAGE=1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
80e69fc
to
80f5388
Compare
* Jenkins Publish Nightly Maven Progress * Seperate Build, Test, and Deploy Stages with parallel
* Jenkins Publish Nightly Maven Progress * Seperate Build, Test, and Deploy Stages with parallel
* Jenkins Publish Nightly Maven Progress * Seperate Build, Test, and Deploy Stages with parallel
* Jenkins Publish Nightly Maven Progress * Seperate Build, Test, and Deploy Stages with parallel
* Jenkins Publish Nightly Maven Progress * Seperate Build, Test, and Deploy Stages with parallel
* Jenkins Publish Nightly Maven Progress * Seperate Build, Test, and Deploy Stages with parallel
Description
This is a
WIPPRto share progresson the maven nightly jenkins pipeline. The goal is to create a Jenkins for continuous deployment that will build scala jars, test them on multiple different environments, and then deploy them to the apache snapshot repository. It can also provide some setup for building out additional continuous deployment efforts for other platforms such as pip.Note that this PR depends on #13046 and is rebased on top of it. Those changes are displayed as part of the first commit in the PR.@nswamy @lanking520 @andrewfayres @piyushghai @marcoabreu @yzhliu
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes